Learning Stochastic Categorial Grammars
Abstract
Stochastic categorial grammars (SCGs) are introduced as a more appropriate formalism for statistical language learners to estimate than stochastic context-free grammars. As a vehicle for demonstrating SCG estimation, we show, in terms of crossing rates and coverage, that when training material is limited, SCG estimation using the Minimum Description Length principle is preferable to SCG estimation using an indifferent prior.
Related resources
Stochastic Categorial Grammars
Statistical methods have turned out to be quite successful in natural language processing. In recent years, several models of stochastic grammars have been proposed, including models based on lexicalised context-free grammars [3], tree adjoining grammars [15], or dependency grammars [2, 5]. In this exploratory paper, we propose a new model of stochastic grammar, whose originality derive...
Rigid Lambek Grammars Are Not Learnable from Strings
This paper is concerned with learning categorial grammars in Gold's model (Gold, 1967). Recently, learning algorithms in this model have been proposed for some particular classes of classical categorial grammars (Kanazawa, 1998). We show that in contrast to classical categorial grammars, rigid and k-valued Lambek grammars are not learnable from strings. This result holds for several variants of...
Conjoinability and unification in Lambek categorial grammars
Recently, learning algorithms in Gold’s model have been proposed for some particular classes of classical categorial grammars [Kan98]. We are interested here in learning Lambek categorial grammars. In general grammatical inference uses unification and substitution. In the context of Lambek categorial grammars it seems appropriate to incorporate an operation on types based both on deduction (Lam...
Categorial Grammars with Iterated Types form a Strict Hierarchy of k-Valued Languages
The notion of k-valued categorial grammars, where a word is associated with at most k types, is often used in the field of lexicalized grammars as a fruitful constraint for obtaining several properties like the existence of learning algorithms. This principle is relevant only when the classes of k-valued grammars correspond to a real hierarchy of languages. Such a property had been shown earlier fo...
Learning Rigid Lambek Grammars and Minimalist Grammars from Structured Sentences
We present an extension of Buszkowski’s learning algorithm for categorial grammars to rigid Lambek grammars and then for minimalist categorial grammars. The Kanazawa proof of the convergence in the Gold sense is simplified and extended to these new algorithms. We thus show that this technique based on principal type algorithm and type unification is quite general and applies to learning issues ...